Robust Sparse Estimation Tasks in High Dimensions
نویسنده
چکیده
In this paper we initiate the study of whether or not sparse estimation tasks can be performed efficiently in high dimensions, in the robust setting where an ε-fraction of samples are corrupted adversarially. We study the natural robust version of two classical sparse estimation problems, namely, sparse mean estimation and sparse PCA in the spiked covariance model. For both of these problems, we provide the first efficient algorithms that provide non-trivial error guarantees in the presence of noise, using only a number of samples which is similar to the number required for these problems without noise. In particular, our sample complexities are sublinear in the ambient dimension d. Our work also suggests evidence for new computational-vs-statistical gaps for these problems (similar to those for sparse PCA without noise) which only arise in the presence of noise.
منابع مشابه
Robust Estimation in Linear Regression with Molticollinearity and Sparse Models
One of the factors affecting the statistical analysis of the data is the presence of outliers. The methods which are not affected by the outliers are called robust methods. Robust regression methods are robust estimation methods of regression model parameters in the presence of outliers. Besides outliers, the linear dependency of regressor variables, which is called multicollinearity...
متن کاملComputationally Efficient Robust Sparse Estimation in High Dimensions
Many conventional statistical procedures are extremely sensitive to seemingly minor deviations from modeling assumptions. This problem is exacerbated in modern high-dimensional settings, where the problem dimension can grow with and possibly exceed the sample size. We consider the problem of robust estimation of sparse functionals, and provide a computationally and statistically efficient algor...
متن کاملMammalian Eye Gene Expression Using Support Vector Regression to Evaluate a Strategy for Detecting Human Eye Disease
Background and purpose: Machine learning is a class of modern and strong tools that can solve many important problems that nowadays humans may be faced with. Support vector regression (SVR) is a way to build a regression model which is an incredible member of the machine learning family. SVR has been proven to be an effective tool in real-value function estimation. As a supervised-learning appr...
متن کاملThink Global, Act Local When Estimating a Sparse Precision Matrix
Substantial progress has been made in the estimation of sparse high dimensional precision matrices from scant datasets. This is important because precision matrices underpin common tasks such as regression, discriminant analysis, and portfolio optimization. However, few good algorithms for this task exist outside the space of L1 penalized optimization approaches like GLASSO. This thesis introdu...
متن کاملThe Wide - Sense Parametric Coverage Estimator against the Distribution Mismatch Problem for Sparse Data
Robust parameter estimation of sparse data is generally applied to the tasks when data collection is time-consuming or of high cost. We point out a new problem caused by sparse data. We find that there may exists coverage mismatch between data samples and the population when the sample size is less than 20. We call it the distribution mismatch (DM) problem. In this study, we derive a wide-sense...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- CoRR
دوره abs/1702.05860 شماره
صفحات -
تاریخ انتشار 2017